Skip to content

refactor(ce-code-review): lean apply model — safe self-apply interactive, report-only for callers#881

Merged
tmchow merged 20 commits into
mainfrom
refactor/ce-code-review-review-only
Jun 3, 2026
Merged

refactor(ce-code-review): lean apply model — safe self-apply interactive, report-only for callers#881
tmchow merged 20 commits into
mainfrom
refactor/ce-code-review-review-only

Conversation

@tmchow
Copy link
Copy Markdown
Collaborator

@tmchow tmchow commented May 30, 2026

What this does

ce-code-review does auto-fixes again — but lean. The old posture bolted fixing onto reviewing through a heavyweight mode:autofix mode, a safe_auto permission tier, and in-skill walkthrough / bulk-preview / ticket-defer machinery. This PR throws that machinery out and replaces it with a short, judgment-based act policy, organized around who owns the working tree:

  • Interactive (a human runs /ce-code-review): the review applies the safe, verified fixes it's confident in — test hardening, mechanical fixes — and reports them in an Applied section. When your working tree was clean before the review it commits them as an isolated fix(review): commit; on a dirty tree it leaves them for your commit (they fold into the work you were already committing). It never pushes. Safe fixes come off your plate — returned to a clean state, not left as a chore — and you stay the gate on the push.
  • mode:agent (a caller like ce-work / lfg runs it): the review is report-only — it mutates nothing and hands findings to the caller, which owns apply and the Residual Work Gate. Pipelines are no longer surprised by mid-run tree mutations.

Apply policy: bias to act on clear, reversible improvements; push back (don't apply) when a reviewer is wrong; defer anything that needs a design decision. The downside is controlled after the fact — reversible edits (a local commit reverts cleanly too), a visible diff, verify-then-keep (revert on red), and the push being the one outward step you own — not by a permission tier. Apply happens only on the tree that was reviewed (local-aligned / standalone), never on a remote-PR diff.

How it materially changed from the old posture

Dimension Before After
Fixing mode:autofix mode + safe_auto permission tier; apply gated by class No apply mode and no tier — a short act policy decides by judgment; legacy safe_auto remaps to gated_auto
Where fixing lives Inside the review skill (walkthrough, bulk-preview, ticket-defer sub-flows) Interactive self-apply for humans; caller-owned apply for automation
Skill weight SKILL.md 898 lines + ~510 lines of apply-machinery references SKILL.md 693 lines (−205 / −23%); machinery references deleted (walkthrough −249, tracker-defer −149, bulk-preview −112). Net the skill is ~690 lines lighter even after adding the new act policy + output skeleton
Per-run cost Heavier always-loaded SKILL.md; triage spun up walkthrough/preview/defer flows Leaner always-loaded skill = less context per review; the removed sub-flows no longer load at triage time
Output Headless text envelope; box-drawn / field-block findings that drifted by severity Deterministic mode:agent JSON contract; one ASCII-safe table shape (terse cell + keyed detail line) for every severity
Validation Could skip independent validation under load Independent per-finding validation that never drops a P0/P1 (over budget trims only the P2/P3 tail)

Net: the review is simpler, lighter, and faster to load, and it still takes the safe, tedious fixes off your plate — via a lean policy instead of a heavy mode.

Apply model (detail)

Before After
mode:autofix could edit the checkout No apply mode — interactive applies safe fixes automatically; mode:agent reports only
safe_auto tier + opaque in-skill autofix Apply is a judgment call (bias-to-act); legacy safe_auto remaps to gated_auto
Walkthrough / bulk-preview / in-skill ticket-defer Removed from the skill — apply and defer live in the caller (ce-work / lfg)
Headless text envelope in agent mode mode:agent JSON — one parseable object + review.json artifact
Applied fixes only via the autofix mode Applied section in interactive (safe, verified; committed as fix(review): on a clean tree, left for your commit on a dirty one); report-only in mode:agent

The review never pushes, opens PRs, or files tickets in any mode; in interactive mode it commits its applied fixes when the tree was clean before the review (the permanence gate is the push, not the local commit). mode:headless is a deprecated alias for mode:agent; mode:report-only is ignored.

Output quality & reliability

  • Validation that never drops criticals — an independent per-finding validation pass; when over budget it trims only the P2/P3 tail, never a P0/P1.
  • Calibrated findings — anchored confidence (50/75/100), cross-reviewer promotion, weak-signal demotion to soft buckets (testing_gaps / residual_risks).
  • One legible format everywhere — terse Issue cell + keyed detail line, the same shape for every severity, ASCII-safe pipe tables (no box-drawing, -> not unicode). The canonical skeleton lives in references/review-output-template.md and is loaded and mirrored at render time.
  • Machine contractmode:agent JSON is the deterministic surface for automation; the markdown report is the human view.

Orchestrator changes (ce-work, ce-work-beta, lfg)

The callers own apply, with a bias-to-act step (apply clear fixes; defer what needs a decision — never a conservative "when unsure, skip" filter):

  • Tier 2 flow: (2a) ce-code-review mode:agent → (2b) review-findings-followup.md applies findings (batched by file) → Residual Work Gate.
  • lfg mirrors the split with a self-contained lfg/references/review-followup.md; ce-work-beta carries its own copy of review-findings-followup.md (skills install as isolated directories — no cross-skill references).

Files

  • Removed from ce-code-review: references/walkthrough.md, references/bulk-preview.md, references/tracker-defer.md (the in-skill apply machinery)
  • Added: references/action-class-rubric.md; caller-owned review-findings-followup.md under ce-work / ce-work-beta
  • The interactive act policy, the Applied section, and the output skeleton live in ce-code-review/SKILL.md + references/review-output-template.md

Test plan

  • bun test tests/review-skill-contract.test.ts tests/pipeline-review-contract.test.ts (70 pass) — covers the apply model (report-only mode:agent vs interactive self-apply, commit-on-clean-tree, never push), the validation cap (P0/P1 never dropped), the terse-cell + keyed-detail-line format, stable numbering across Applied/severity/Actionable, and no mode:autofix / no deny-list
  • bun run release:validate in sync
  • CI test check on PR

Compound Engineering

tmchow and others added 6 commits May 21, 2026 18:26
…wers

Remove DHH/Kieran stack reviewers, fold structural checks into maintainability, suppress P3 findings in synthesis, and merge migration/schema-drift agents into a single conditional data-migration reviewer.

Co-authored-by: Cursor <cursoragent@cursor.com>
Extend data-migration Step 0 to diff db/structure.sql when present, and route plan deepening migration risks to ce-data-integrity-guardian instead of the PR-review persona.

Co-authored-by: Cursor <cursoragent@cursor.com>
Re-enable P3 findings in reports and synthesis after the persona refactor
temporarily suppressed them. Register removed migration and stack reviewers
in LEGACY_ONLY_AGENT_DESCRIPTIONS and EXTRA_LEGACY ce-* agent paths so
upgrades sweep stale flat installs.

Co-authored-by: Cursor <cursoragent@cursor.com>
Register removed review personas in STALE_AGENT_NAMES and
LEGACY_ONLY_AGENT_DESCRIPTIONS, and resolve current-agent lookup for
names that already include the ce- prefix so cleanupStaleAgents removes
flat installs like ce-dhh-rails-reviewer.md.

Co-authored-by: Cursor <cursoragent@cursor.com>
…callers

Orchestrated workflows now invoke review for findings and own fix application separately, so ce-work and lfg can batch fixes without the review skill mutating the checkout.

Co-authored-by: Cursor <cursoragent@cursor.com>
Resolve ce-code-review Stage 6 conflict by keeping review-only output
(no Applied Fixes section, JSON schema for mode:agent).

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0346beae3b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread plugins/compound-engineering/skills/ce-work-beta/references/shipping-workflow.md Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5391c57851

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread plugins/compound-engineering/skills/lfg/references/review-followup.md Outdated
…te (#881)

- Duplicate review-findings-followup into ce-work-beta for isolated installs
- Inline LFG apply bar in lfg/references/review-followup.md
- Replace stale headless text envelope with JSON contract in review-output-template

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9bee6ec5cc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread plugins/compound-engineering/skills/ce-code-review/SKILL.md Outdated
Comment thread plugins/compound-engineering/skills/ce-code-review/SKILL.md Outdated
Return status skipped in mode:agent when PR pre-checks bail out.
In pr-remote scope, fetch PR head ref and forbid workspace Read/Grep
for changed files; validators use git show or diff hunks only.

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a2ec295fb5

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread plugins/compound-engineering/skills/ce-code-review/SKILL.md
Comment thread plugins/compound-engineering/skills/ce-code-review/SKILL.md Outdated
Add branch-remote inspection rules for no-checkout branch reviews, use local tree diffs instead of appending gh pr diff when aligned on the PR branch, and sync ce-work-beta residual gate sentinels with ce-work.

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3f22822d9b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread plugins/compound-engineering/skills/ce-code-review/SKILL.md Outdated
Comment thread plugins/compound-engineering/skills/ce-code-review/SKILL.md Outdated
…#881)

Two remaining gaps in the review-scope model surfaced by Codex review of
3f22822, both causing findings to be validated against the wrong tree:

- PR scope classification trusted the branch name alone, so a fork PR (or
  a stale local branch) sharing a name with the PR head was treated as
  local-aligned and diffed/inspected against the local checkout. Now
  local-aligned also requires the PR to be same-repo (isCrossRepository
  false) and the PR head commit to be an ancestor of local HEAD; metadata
  fetch gains headRefOid + isCrossRepository.
- Stage 5b validators never received the scope mode or remote head ref, so
  in pr-remote/branch-remote they fell back to Read/Grep on the stale
  workspace. Now the scope mode and <pr-head-ref>/<branch-head-ref> are
  injected into validator prompts (mirroring Stage 4), with inspection
  access scoped by mode. The validator template already consumed these.

Note: pre-existing CLI install test failures (tests/cli.test.ts) are
unrelated to this change and present on HEAD before it.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bc9d30260a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread plugins/compound-engineering/skills/ce-code-review/SKILL.md
@tmchow tmchow changed the title refactor(review): make ce-code-review review-only and split apply to callers refactor(ce-code0review): make ce-code-review review-only and split apply to callers Jun 2, 2026
@tmchow tmchow changed the title refactor(ce-code0review): make ce-code-review review-only and split apply to callers refactor(ce-code-review): make ce-code-review review-only and split apply to callers Jun 2, 2026
…ew in apply followup

Address PR review feedback (#881):

- ce-code-review: Stage 2 branch-mode intent discovery now runs
  `git log ${BASE}..<branch-ref>` using the resolved ref instead of the raw
  `<branch>` argument, so remote-only branch reviews read the right commit
  intent instead of failing or reading a stale same-named local branch.
- ce-work / ce-work-beta apply followup: consume the review already produced
  by Tier 2 step 2a instead of re-invoking ce-code-review; re-invocation is
  now a documented cold-caller fallback. Avoids a second full review per
  Tier 2 run. Updated the ce-work SKILL.md section anchor and the contract
  test to lock in the corrected (no-double-review) behavior.

Note: pre-existing failures in CLI install/cleanup/target-detection tests
(47, unrelated to these skill-content changes) not addressed by this PR.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 633576a3f9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread plugins/compound-engineering/skills/ce-optimize/SKILL.md Outdated
…gs-table output

ce-code-review hardening prompted by a dogfood run that shipped findings in
the output template's own documented anti-pattern format and skipped
validation on the P1 findings:

- Stage 5b: resolve the "skip when >15 survivors" vs "validate top-15"
  contradiction. The stage no longer skips while any finding survives, and
  P0/P1 findings are always validated (raise the cap rather than drop a
  critical); only the P2/P3 tail is dropped when over budget.
- Stage 5b: sanction orchestrator direct verification of load-bearing facts
  (a pinned dependency's source, repo wiring) as a complement to the validator
  wave, using single-purpose native tools instead of chained shell.
- Stage 6: inline a literal findings-table skeleton in the always-loaded
  SKILL.md plus a concrete negative-signature tripwire (no Field: blocks, no
  box-drawing/middot separators, no lists) so the output contract survives a
  long session even when the template reference was not reloaded.
- Output format: state that mode:agent JSON is the deterministic cross-harness
  machine contract and default markdown is the human view; keep markdown
  ASCII-safe so it degrades gracefully across terminals and harnesses.
- Lock the P0/P1-always-validated invariant into the Stage 5b contract test.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0b4baf792c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread plugins/compound-engineering/skills/lfg/references/review-followup.md Outdated
Comment thread plugins/compound-engineering/skills/ce-code-review/SKILL.md Outdated
The human-facing per-severity tables are now 5 columns
(# | File | Issue | Reviewer | Confidence). The synthesized route
(autofix_class -> owner) moves to the Actionable Findings table and the
mode:agent JSON, where it is actionable — keeping the scannable severity
tables narrow enough to render across terminals and non-Claude harnesses.

The finding set, routing, confidence gating, and JSON contract are
unchanged; this is presentation only. Updates the Stage 6 inline skeleton
and column description, the output-template examples and formatting rules,
the numbering fixture, and the primary-findings regex in the contract test.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 729b7fef99

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread plugins/compound-engineering/skills/ce-code-review/SKILL.md Outdated
Re-introduce auto-fixing into ce-code-review as a lightweight, judgment-based
act policy (not the removed mode:autofix machinery). In default (interactive)
mode the review now applies the safe, verified fixes it is confident in and
reports them in an Applied section; in mode:agent it stays report-only and the
caller applies. It never commits or pushes — the human/caller owns permanence.

- Stage 5c "Act on findings (default mode only)": bias-to-act policy (apply
  clear reversible improvements; push back if the reviewer is wrong; defer what
  needs a decision), scope invariant (local-aligned/standalone only),
  verify-then-keep (revert on red), no auto-commit, and prominent surfacing of
  green-but-unverifiable edits (auth/contract/concurrency). No deny-list: a
  code-review fix is a reversible edit, so downside is controlled after the fact
  (revert + visible diff + commit checkpoint).
- Stage 6 + output template: Applied section (# | File | Fix | Reviewer +
  validation line); applied findings keep their stable # and stay out of the
  severity tables.
- Operating principles / Action Routing / autofix deprecation reframed:
  autofix_class is signal, never an apply gate; mode:agent never mutates.
- ce-work and ce-work-beta apply step reframed from a conservative
  "when unsure, skip" eligibility filter to the same bias-to-act policy.
- Contract test + numbering fixture cover the new behavior; skill doc updated
  (also corrected pre-existing drift: removed safe_auto and four-modes framing).

Plan: docs/plans/2026-06-02-001-feat-ce-code-review-safe-autofix-plan.md
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ee7891edd3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread plugins/compound-engineering/skills/ce-code-review/SKILL.md
Comment thread plugins/compound-engineering/skills/ce-code-review/SKILL.md
…ix change

Simplification pass over ee7891e found stale/contradictory prose the feature
commit missed:

- SKILL.md "After Review": dropped the "Review-only handoff / do not edit
  project files" framing that contradicts Stage 5c default-mode apply; scoped
  apply vs report-only by mode.
- ce-work / ce-work-beta followup: scoped the "review-only / does not mutate"
  claim to the mode:agent invocation (true for the caller path) rather than
  asserting it of the skill as a whole.
- docs/skills + output template: removed the stale safe_auto promotion bullet
  and the deprecated report-only Mode-line value.
- contract test: de-duplicated the mutate-contract assertions across two tests
  (each invariant asserted once in its relevant test) and fixed a misleading
  comment.

Behavior preserved: contract tests 28/0; full suite at the 47 pre-existing
failures, zero new.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cbc343925e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread plugins/compound-engineering/skills/ce-code-review/SKILL.md
…er-time template load

Dogfood run showed the findings format still escaping to field-blocks for
detail-heavy P1s (while P2/P3 stayed tables) and a duplicate finding number on
a multi-file applied fix. Make the canonical output skeleton accommodate detail
so the format stops fighting the content:

- Findings: terse Issue cell (the scannable index) + a keyed detail line
  (`- **#N** — ...`) under each severity table for findings that need depth
  (usually P0/P1). One shape for every severity.
- review-output-template.md is the canonical copy-this skeleton; Stage 6 now
  instructs loading and mirroring it at render time (not just "see the
  template"), with the inline stub as the always-loaded fallback.
- Format gate now catches the real failure (inconsistent treatment across
  severities) and explicitly does not flag the keyed detail line.
- Applied section: a multi-file fix is one row with one # (no duplicate
  numbers); green-but-unverifiable edits flagged inline in the Fix cell.
- Tests + fixture cover the detail line, the multi-file one-row applied fix,
  and the consistency/load contract.

Behavior preserved: contract tests 29/0; full suite at the 47 pre-existing
failures, zero new.
The latest dogfood run kept the terse-cell + detail-line format (good) but
still packed full sentences into several P2 Issue cells. Replace the vague
"terse" guidance with a checkable bar: one short clause (~12 words or fewer,
no second sentence, no because/so/which explanation); the moment a cell wants
a comma-plus-clause or a reason, move the depth to the keyed detail line.
Applied in the SKILL.md Stage 6 skeleton + Findings step and the output
template; contract test asserts the named test is present.
@tmchow tmchow changed the title refactor(ce-code-review): make ce-code-review review-only and split apply to callers refactor(ce-code-review): lean apply model — safe self-apply interactive, report-only for callers Jun 3, 2026
tmchow added 2 commits June 2, 2026 19:21
…e commit

Refine the interactive apply model after design review. The permanence gate is
the push, not the commit: a local commit is private and reversible, and leaving
applied fixes uncommitted on a clean tree just creates a chore plus a window
where the user's next edits entangle with them.

- Stage 5c: when the working tree was clean before the review, commit the
  applied fixes as one isolated `fix(review):` commit; on a dirty tree apply but
  leave them for the user's commit (the fixes can't be isolated from WIP).
  Never push, open PRs, or file tickets.
- Operating principle, After Review, the Applied report step, the output
  template, the skill doc, and the plan all updated to commit-on-clean /
  never-push (was "never commits or pushes").
- Contract tests updated to the new contract.

bun test: contract 29/0; full suite at the 47 pre-existing failures, zero new.
… contract

Address 7 P2 review threads, several of them fallout from this branch's
apply-model changes:

- mode:agent is now described as report-only (skips Stage 5c apply), not
  "serialization only" — a runner could otherwise apply during mode:agent and
  the caller would re-apply the same findings.
- Deprecated mode:autofix removed from the conflict-stop list so it no longer
  contradicts the deprecated-mode "ignore and proceed" path.
- Agent-mode output: the markdown Actionable Findings summary is default-mode
  only; mode:agent carries actionable findings solely in the JSON field and
  emits nothing after the object.
- Agent-mode JSON: require one raw (unfenced) JSON object so naive JSON.parse
  callers don't choke on a leading code fence.
- pr-remote scope: resolve a real PR_BASE_REF (fetch baseRefName + rev-parse)
  and pass it to reviewers/validators so the data-migration persona's
  git diff <base> -- db/schema.rb schema-drift check works on remote PRs.
- ce-optimize: inline a local mechanical-apply bar instead of referencing
  ce-work's review-findings-followup.md (skills must be self-contained).
- lfg review-followup: make the post-fix push upstream-aware so a fresh branch
  with no upstream doesn't break the autonomous pipeline.

Contract test updated for the report-only reframe. bun test: contract 29/0;
full suite at the 47 pre-existing failures, zero new.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 89cff2103f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread plugins/compound-engineering/skills/ce-code-review/SKILL.md
PR #881 feedback: Stage 5b dropped a finding as "validator failed" on any
validator infrastructure failure (timeout / dispatch error / malformed JSON).
For P0/P1 that silently removes the highest-severity issues from the
report/actionable JSON on a transient failure — the opposite of safe, and now
that the validation pass always runs (default + mode:agent) it's reachable.

Now: P2/P3 still drop on infra failure (conservative bias); P0/P1 are kept and
their validation marked degraded (reported in Coverage). A genuine
validated:false rejection still drops at any severity.

bun test: contract 29/0; full suite at the 47 pre-existing failures, zero new.
@tmchow tmchow merged commit 3eedade into main Jun 3, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant